The genetic information carried on mRNA 6 of feline infectious peritonitis virus (FIPV) strain 79-1146 was determined
by sequence analysis of cDNA clones derived from the 3’ end of the FIPV genome. Two ORFs were found,
encoding polypeptides of 11K (ORF-1) and 22K (ORF-2). The FIPV sequence was compared to the 3’ end sequence of
transmissible gastroenteritis virus (TGEV). ORF-1 has a homologous counterpart (ORF-X3) in the TGEV genome; both
ORFs are located at the same position relative to the nucleocapsid gene. However, as a result of an in-frame insertion
or deletion, ORF-1 is 69 nucleotides larger than ORF-X3. A similar event has occurred immediately downstream of
ORF-1: a 624-nucleotide segment, containing the complete ORF-2, is absent in the TGEV sequence. Most sequence
similarity (98.5%) was found in the 3’ noncoding sequences. ORF-X3 and ORF-1 are preceded by the sequence
AACTAAAC, which is assumed to be the transcription-initiation signal in FIPV and TGEV (P. A. Kapke and D. A. Brian (1986)
Virology 151, 41-49). By S1 nuclease analysis, the 5’ end of FIPV RNA 6 was mapped immediately upstream of this
sequence. A 700-nucleotide TGEV-specific RNA was found by cross-hybridization with an FIPV 3’ end probe, suggesting
that TGEV ORF-X3 is also carried on a separate mRNA. The differences at the 3’ ends of the FIPV and TGEV genomes
may be the result of RNA recombination events.


INTRODUCTION 


Coronaviruses, a group of enveloped, positive-stranded
RNA viruses, have attracted considerable interest
because of their unusual replication strategy. In
the infected cell, there are five to seven subgenomic
mRNAs which form a 3’ coterminal nested set: they
have common 3’ ends but extend for different lengths
in the 5’ direction. In addition, the RNAs share a short
5’ leader sequence, which is fused to the RNA “body”
via discontinuous transcription (Spaan et al., 1983; Lai
et al., 1984; Brown et al., 1984). Translation of each
RNA is thought to be restricted to the open reading
frames (ORFs) at the 5’ end that are not present in the
smaller RNAs (for review see Siddell et al., 1983).
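
The nested-set arrangement described above can be illustrated with a minimal sketch. It assumes that every subgenomic mRNA body runs from its own 5’ start to the common 3’ end, and that an ORF belongs to the “unique” region of an mRNA when it starts after that mRNA's body start but before the start of the next smaller mRNA. The coordinates and the names `Orf` and `unique_orfs` are hypothetical and are not taken from any coronavirus genome.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Orf:
    name: str
    start: int  # 1-based genome coordinate of the first nucleotide
    end: int    # 1-based genome coordinate of the last nucleotide


def unique_orfs(body_starts, orfs):
    """Map each mRNA to the ORFs located in its 5' 'unique' region.

    body_starts: dict of mRNA name -> genome position where its body begins.
    All mRNAs are assumed to extend from that position to the common 3' end.
    """
    # Order the mRNAs from longest (smallest start) to shortest.
    ordered = sorted(body_starts.items(), key=lambda item: item[1])
    result = {}
    for i, (name, start) in enumerate(ordered):
        # The unique region ends where the next smaller mRNA begins.
        next_start = ordered[i + 1][1] if i + 1 < len(ordered) else float("inf")
        result[name] = [o.name for o in orfs if start <= o.start < next_start]
    return result


# Hypothetical coordinates, for illustration only.
orfs = [Orf("A", 100, 900), Orf("B", 950, 1500), Orf("C", 1520, 1800)]
starts = {"mRNA 1": 50, "mRNA 2": 930, "mRNA 3": 1510}
print(unique_orfs(starts, orfs))
```

If the hypothetical "mRNA 2" is removed from `starts`, ORFs A and B both fall in the unique region of "mRNA 1", which is the situation of an mRNA carrying more than one ORF in its 5’ segment.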

In vitro translation of the viral mRNAs (Rottier et al.,
1981; Siddell, 1983; Stern and Sefton, 1984; Jacobs
et al., 1986; de Groot et al., 1987a) and the sequence
analysis of coronavirus genomes (Boursnell et al.,
1987; Rasschaert et al., 1987; Armstrong et al., 1984;
Skinner and Siddell, 1985; Skinner et al., 1985;
Schmidt et al., 1987; Luytjes et al., 1987) have allowed
the construction of genomic maps. The relative position
of the genes encoding the structural proteins is
conserved on the genomes of these viruses. However,
differences in the genomic maps indicated that other
transcription units have been lost, gained, or translocated
as the coronaviruses diverged (de Groot et al.,
1987a).

Feline infectious peritonitis virus (FIPV) and transmissible
gastroenteritis virus (TGEV) of swine belong to the
same antigenic cluster (Pedersen et al., 1978; Horzinek
et al., 1982; Siddell et al., 1983) and are closely related:
sequence analysis of their peplomer genes revealed up
to 93% sequence identity (Jacobs et al., 1987). Despite
this close relationship, TGEV and FIPV differ in their
genomic organization (de Groot et al., 1987a).

TGEV is generally reported to specify six poly(A)-containing
RNAs, the smallest of which (1.9 kb) encodes
the nucleocapsid (N) protein (Jacobs et al., 1986). In
contrast to other coronaviruses, like infectious bronchitis
virus (IBV) and mouse hepatitis virus (MHV), the
nucleocapsid gene is not the 3’-most ORF. A short ORF
(ORF-X3), potentially encoding a polypeptide of 9.1K,
is found further downstream (Kapke and Brian, 1986;
Rasschaert et al., 1987). Although the presumptive
transcription-initiation signal, AACTAAAC, is present
at the 5’ end of ORF-X3, it is not clear whether this ORF
is carried on a separate mRNA (Jacobs et al., 1986;
Kapke and Brian, 1986; Rasschaert et al., 1987).
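
Locating the presumptive signal in a sequence is a plain motif search. The sketch below is an illustration only: the function name `find_motif` and the 20-nucleotide toy fragment are assumptions made for the example, and the search reports every exact, possibly overlapping, occurrence as a 1-based position.

```python
def find_motif(seq, motif="AACTAAAC"):
    """Return the 1-based start positions of every exact occurrence of motif.

    RNA input is accepted: U is converted to T before searching.
    """
    seq = seq.upper().replace("U", "T")
    positions = []
    start = seq.find(motif)
    while start != -1:
        positions.append(start + 1)          # convert to 1-based coordinates
        start = seq.find(motif, start + 1)   # allow overlapping occurrences
    return positions


# Toy fragment, for illustration only.
toy = "GGTTACGAACTAAACGCATG"
print(find_motif(toy))
```

An exact search is the simplest choice; it will miss a signal that differs from the consensus by a single nucleotide.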

For FIPV an RNA of 2.8 kb (RNA 5) encodes the N
protein, while the smallest RNA (RNA 6) has a length of
about 1450 nucleotides. These findings indicated the presence
of a large insertion at the 3’ end of the FIPV genome
as compared to the TGEV genome. In this report we
describe the cloning and sequence analysis of the 3’
end of the FIPV strain 79-1146 genome; a detailed
comparison with the 3’ end of the TGEV genome is
presented.


MATERIALS AND METHODS 
Selection and analysis of cDNA clones 


cDNA clones containing sequences derived from the
3’ end of the FIPV genome were selected from a
“random” cDNA library of FIPV 79-1146 genomic RNA (de
Groot et al., 1987b) by hybridization to sucrose-gradient
purified, ³²P-labeled FIPV RNA 6 in 50% formamide,
5x SSC, 5x Denhardt's solution, 100 ng/ml salmon sperm
DNA at 42°. Recombinant DNA techniques were
performed by standard methods (Maniatis et al., 1982).
Sequence analysis was carried out using the
dideoxynucleotide chain termination procedure (Sanger et al.,
1977). Sequence data were assembled and analyzed
by using the computer programs by Staden (1986). S1
nuclease analysis and Northern blot analysis were
performed as described (de Groot et al., 1987a,b).


RESULTS 


Isolation and characterization of recombinants 
containing sequences derived from the 3’ end of the 
FIPV genome 


The preparation of a cDNA library of FIPV genomic
RNA in pUC9 was described previously (de Groot et al.,
1987b). Recombinants containing sequences derived
from the 3’ end of the FIPV genome were isolated by
colony hybridization with an RNA fraction enriched for
RNA 6 (fraction 28, de Groot et al., 1987a). The plasmids
pB12, pC12, and pE7 were selected for sequence
analysis. In Fig. 1 the sequence strategy is outlined.
Three major ORFs were identified (Fig. 1). The 5’-most
ORF could be identified as the 3’ end of the FIPV
nucleocapsid gene, the sequence of which was 64%
identical to the corresponding TGEV sequence (not
shown). Figure 2 shows the nucleotide sequence and
the predicted amino acid sequences of the region
downstream of the N gene. As shown in Fig. 1, this
sequence was determined on both strands and on two
independent cDNA clones, except for the 3’-most 68
nucleotides.

ORF-1 (positions 49 to 375) predicts a protein of 108 
residues. In the genomic sequence this ORF overlaps 
with the N gene. The first AUG codon (position 49) is 
followed by the sequence AACTAAAC; a second AUG 
codon is present at position 70 (Fig. 2). ORF-2 (posi- 
tions 380 to 1000) could encode a polypeptide of 206 
residues. 
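
The residue counts follow directly from the ORF coordinates. The sketch below assumes that the quoted positions are inclusive and that each range ends with the stop codon; the function name `orf_residues` is introduced here for illustration.

```python
def orf_residues(start, end, includes_stop=True):
    """Number of amino acid residues encoded by an ORF.

    start, end: inclusive 1-based nucleotide positions.
    includes_stop: whether the range ends with the termination codon.
    """
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    codons = length // 3
    return codons - 1 if includes_stop else codons


print(orf_residues(49, 375))    # ORF-1: 327 nt, 109 codons, 108 residues
print(orf_residues(380, 1000))  # ORF-2: 621 nt, 207 codons, 206 residues
```

Under these assumptions the results agree with the 108 and 206 residues given in the text.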


Comparison with the 3’ end of the TGEV genome 


Figure 3a shows a dot matrix comparison of the 3’ 
end of the FIPV and TGEV genomes (Kapke and Brian, 
1986). The highest sequence similarity (98.5%) was 
found in the 3’ noncoding regions. ORF-1 is 78% identi- 
cal to the TGEV ORF-X3 but contains 69 nucleotides in 
addition (indicated by a dashed line in Fig. 2). The AUG 
start codon of ORF-X3 corresponds to the second AUG 
codon of ORF-1. 
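
Whether an insertion or deletion preserves the reading frame depends only on its length modulo three. As a small worked check (the helper name `indel_effect` is an assumption of this sketch), the 69 additional nucleotides of ORF-1 correspond to 23 codons:

```python
def indel_effect(length_nt):
    """Return (in_frame, residues) for an insertion or deletion.

    residues is the number of codons gained or lost when the event is
    in frame, and None when it shifts the reading frame.
    """
    if length_nt <= 0:
        raise ValueError("length must be positive")
    in_frame = length_nt % 3 == 0
    return in_frame, (length_nt // 3 if in_frame else None)


print(indel_effect(69))  # (True, 23)
```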

A 624-nucleotide segment (positions 376-1000)
immediately downstream of ORF-1 is absent in the TGEV
sequence. Strikingly, this segment corresponds
exactly to ORF-2. A schematic alignment of the TGEV and
FIPV sequences is shown in Fig. 3b.
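
For readers unfamiliar with dot matrix comparisons, the following sketch shows the general technique on toy data. It is not the program used for Fig. 3a: the window size, the match threshold, and the two short sequences are assumptions chosen for the example. A dot is recorded wherever a window of one sequence matches a window of the other in at least `min_matches` positions, so an insertion in one sequence appears as a shift of the diagonal.

```python
def dot_matrix(seq_a, seq_b, window=6, min_matches=6):
    """Return (i, j) pairs (0-based window starts) that pass the threshold."""
    dots = []
    for i in range(len(seq_a) - window + 1):
        win_a = seq_a[i:i + window]
        for j in range(len(seq_b) - window + 1):
            win_b = seq_b[j:j + window]
            matches = sum(a == b for a, b in zip(win_a, win_b))
            if matches >= min_matches:
                dots.append((i, j))
    return dots


# Toy sequences: seq_a carries an 8-nucleotide insertion absent from seq_b.
prefix, insertion, suffix = "ACGTTGCA", "GGGGCCCC", "TTAGCATC"
seq_a = prefix + insertion + suffix
seq_b = prefix + suffix

dots = dot_matrix(seq_a, seq_b)
print(dots)
# The diagonal offsets i - j reveal the size of the insertion.
print(sorted({i - j for i, j in dots}))
```

With these toy sequences the dots fall on two diagonals, offset by 0 before the insertion and by 8 after it.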

None of the recombinants isolated from our cDNA 
library contained the poly(A) tail, probably because the 
cDNA synthesis was randomly primed by calf thymus 
DNA pentamers (de Groot et al., 1987b). If aligned to
the TGEV sequence, the most 3’ located clone, E7, 
ends just one nucleotide upstream of the poly(A) tail. 


Localization of the 5’ end of the presumptive RNA
body of RNA 6


To determine the 5’ end of RNA 6, we used S1
nuclease analysis. An M13-recombinant phage containing
the virus-sense strand of the 950-bp PstI-TaqI
fragment (Fig. 1) served as a template to prepare a
uniformly labeled probe. This probe was hybridized to
sucrose-gradient purified RNA 6, followed by S1 nuclease
digestion. A fragment of 514 nucleotides was
protected (Fig. 4a). The precise length was determined in
a sequencing gel (Fig. 4b). This indicates that the 5’ end
of the RNA 6 body maps at position 60, immediately
upstream of the AACTAAAC box. Consequently, the
AUG codon at position 49 is not present in RNA 6.
Furthermore, these results suggest that the body of
RNA 6 has a length of 1212 nucleotides, provided that
there are no additional insertions. Assuming an RNA
leader sequence of 60-70 nucleotides (Spaan et al.,
1983; Lai et al., 1984; Brown et al., 1984) and a poly(A)
tail of about 100 nucleotides we arrive at a predicted
length of approximately 1400 nucleotides. Previously,
RNA 6 was estimated to be 1600 nucleotides (de Groot
et al., 1987a). By using gels with a better resolution in
this MW range, we have now measured a length of
about 1450 nucleotides (not shown).
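
The length estimate is a simple sum of the three parts of the mRNA. The sketch below reproduces it with the values quoted in the text (a body of 1212 nucleotides, a leader of 60-70 nucleotides, and a poly(A) tail of about 100 nucleotides); the function name `predicted_mrna_length` is introduced here for illustration.

```python
def predicted_mrna_length(body_nt, leader_range=(60, 70), poly_a_nt=100):
    """Return the (low, high) predicted mRNA length in nucleotides."""
    low = body_nt + leader_range[0] + poly_a_nt
    high = body_nt + leader_range[1] + poly_a_nt
    return low, high


print(predicted_mrna_length(1212))  # (1372, 1382)
```

The range of 1372 to 1382 nucleotides corresponds to the approximately 1400 nucleotides stated above and is close to the measured value of about 1450.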

Since the AACTAAAC box preceding ORF-1 apparently
is used as a signal for initiation of transcription,
we expected this also to be the case for the AACTAAAC
sequence preceding the TGEV ORF-X3. Figure
5 shows that in a Northern blot of oligo(dT)-selected
RNAs of TGEV-infected cells, an RNA of about 700
nucleotides can be detected by cross-hybridization with
the 1600-bp PstI fragment of clone B12.

Fig. 1. Restriction endonuclease map and sequence strategy for the FIPV cDNA clones E7, B12, and C12. P, PstI; S, SphI; T, TaqI; and X,
XbaI. Sau3A and RsaI restriction fragments (not shown) were also used to complete the sequence. The upper panel presents a schematic
diagram of the possible open reading frames obtained when translating the nucleotide sequence as virus-sense RNA. The locations of the
nucleocapsid gene (N), ORF-1, and ORF-2 are indicated.


DISCUSSION 


Differences in the mRNA sets of TGEV Purdue and
FIPV 79-1146 indicated the presence of large insertions
at the 3’ end of the FIPV genome (de Groot et al.,
1987a). We have characterized these insertions by
sequence analysis of recombinants which had been
isolated from an FIPV-specific cDNA library by hybridization
with the smallest FIPV mRNA (RNA 6).

RNA 6 carries two ORFs, encoding polypeptides of
11K (ORF-1) and 22K (ORF-2). In the genomic
sequence, ORF-1 overlaps with the 3’ end of the N gene.
However, by S1 nuclease analysis it was shown that
only sequences downstream of the N gene are
contained in mRNA 6. Consequently, in this RNA only the
second AUG codon of ORF-1 is available for
translation-initiation.

The 5’ end of the body of RNA 6 was mapped
immediately upstream of the sequence AACTAAAC, the
presumptive transcription-initiation signal in the FIP/TGE
viruses. This consensus sequence is not present
between ORF-1 and ORF-2. Furthermore, there are no
indications for an RNA smaller than RNA 6. Therefore, if
both ORFs are to be translated, RNA 6 must function
as a bicistronic mRNA. The start codon of ORF-1 and
the two internal, out-of-frame AUGs at positions 86 and
188 are in an unfavorable context for translation-initiation,
while the AUG of ORF-2 ranks among the most
efficient start codons. According to the scanning
hypothesis (Kozak, 1986a,b) this arrangement would
favor translation of ORF-2. Translation of ORF-2 could
also occur via reinitiation (Peabody and Berg, 1986;
Peabody et al., 1986). Coronavirus mRNAs containing
two or three ORFs in the 5’ “unique” segment have
previously been described for MHV (Skinner et al.,
1985) and IBV (Boursnell et al., 1985). Recently, Smith
et al. (1987) provided evidence for in vivo expression of
a “downstream” ORF of IBV RNA D.

ORF-1 is homologous to the TGEV ORF-X3 and present
at the same location relative to the N gene. However,
due to an in-frame insertion in FIPV or to a deletion
in TGEV, ORF-1 is 69 nucleotides longer. The
predicted ORF-X3 product contains hydrophobic
segments of about 25 residues at the N- and C-terminus;
these segments are separated by a hydrophilic
central region (Kapke and Brian, 1986). The 23-residue
insert in the ORF-1 product enlarges this hydrophilic
region. Moreover, due to a point mutation, ORF-1
contains a potential N-glycosylation site (Fig. 2) which is
absent in the ORF-X3 sequence as determined by
Kapke and Brian (1986). The ORF-X3 sequences
determined by Rasschaert et al. (1987) and Britton et al.
(1988) contain a potential glycosylation site located
four amino acid residues upstream of the ORF-1
glycosylation site. The features of the ORF-1 and ORF-X3
products are characteristic for membrane proteins.
ORF-1 and ORF-X3 may encode minor structural
proteins which have not yet been detected because of
their small size.

N


EAYTDVFODODTQVEMIODEVTNY* ORF 1
MROLURTKRMLVFVHAVLVTALILLUG

TGGAGGCATACACAGATGTGTTTGATGACACACAGGTTGAGATGATTGATGAGGTTACGAACTAAACGCATGCTOGTITTOGTCCATGCIGIACTTIGTAACAGCTTTAATCTTACTACTA
50 4 100


L D SF K N -~------------------------------+------------+---------
IGRIQLLERLLLSHLL@LITVSNVLGVPDSSLREVNCLQLL
ATTGGTAGAATCCAATTACTAGAAAGGTTGTTACTCAGTCATCIGCTTAATCTTACAACAGTCAGTAATGTTTTAGGIGIGCCTGACAGTAGTCIGCGIGTAAATTGITTGCAGCITTTG

170 220


Beseeol eee ae YRS K Cc R
KPDCLDFNILHKVLAETRLLVVVLURVIFLVLLGFSCYTLE
AAACCAGACTGCCTIGATTTTAATATCTTACATAAAGTTTTAGCAGAAACCAGGTTACTAGTAGTAGTACTGOGAGTGATCTTTCTAGTTCTICTAGGGTTTICCIGCTATACATIGTTG


290 340
ORF 2 -->
GALF* MIvvVtILVCtIFLANGIKATAVQNODULHEHPVLUTWOD LE
GGTGCATTATTTTAACATCATGATTGTTGTAATCCTIGTGTIGTATCITTITTIGGCTAATGGAATTAAAGCTACTGCTGTIGCAAAATGACCTICATGAACATCCOGTTCTTACCTGGGATTT
410 460


LQHFIGHTULUYtItT’TTAHQVLEALPLGSR VECEGIEGEFNCTWPGF
ATTACAGCATTTCATAGGACATACCCTCTACATTACAACACACCAGGTCTTAGCACTACCGCTTGGATCTCGTGTTGAGTGIGAGGGTATCGAAGGTTTCAATTGCACATGGCCTGGCTT
530 580


QDPAHODHtIODF YFDLUS NPFYS FVODNFYtIVSEGNQRINULBURLUV
TCAAGATCCTGCACATGATCATATIGATTTCIACITTGATCTITCTAATCCITTCTATICATTTGIAGATAATTTITATATTGTAAGTGAGGGAAATCAAAGAATCAATCTCAGATTGGT
650 700


GAVPKQKRULNVGC§HTS FAVODLPFGIQIY HDRODFQHPVODGR
CAAAACAAAAGAGATTAAATGTIGGTIGTCATACATCATTTGCTGTIGATCTTCCATTTGGGATTCAGATATACCATGACAGGGATTTTCAACACCCTGTITGATGGCAG
770 820


HL DCtTHRVYFVK YC PHNLHG YcCFNERLUKV Y DUK QFRS KK V
ACATCTAGATTGTACTCACAGAGTGTACTTIGTIGAAGTACTGTCCACATAACCIGCATGGTTATTGCTTTAATGAGAGGCTGAAAGTTTATGACTTGAAGCAATTCAGAAGCAAGAAGGT
890 940


FDK INQHHKTEL®*
CTTCGACAAAATCAACCAACATCATAAAACTGAGTTATAAGGCAACCCGATGTCTAAAACTGGTCTTTCCGAGGAATTACGGGTCATCGCGCTGCCTACTCTIGTACAGAATGGTAAGCA
1010 1060


CGTGTAATAGGAGGTACAAGCAACCCTATTGCATATTAGGAAGTTTAGATTTGATTTIGGCAATGCTAGATTTAGTAATTTAGAGAAGTTTAAAGATCOGCTATGACGAGCCAACAATGGA
1130 1180


AGAGCTAACGTCTGGATCTAGTGATTGTTTAAAATGTAAAATTGTITTGAAAATTTTOCTTTTGATAGTGATG
1250

Fig. 2. The nucleotide sequence and the deduced amino acid sequences for the 3’ end of the FIPV strain 79-1146 genome. Only the extreme
3’ end of the nucleocapsid gene (N) is presented. The presumptive transcription-initiation signal is underlined. An arrow indicates the 5’ end of
the “body” of RNA 6 as determined by S1 analysis. The presumptive initiating methionine of ORF-1 is boxed. Stop codons are indicated by
asterisks. Differences in the amino acid sequences of ORF-1 and TGEV ORF-X3 (Kapke and Brian, 1986) are also shown; the 69-nucleotide
segment, which is absent in TGEV ORF-X3, is indicated by a dashed line. Bases 376 to 1000 are also not present in the TGEV sequence.
The potential N-glycosylation site in ORF-1 is indicated by encircling the asparagine residue.

Fig. 3. (a) Dot matrix comparison of the nucleotide sequence of the 3’ ends from FIPV strain 79-1146 and TGEV strain Purdue (Kapke and
Brian, 1986). A schematic presentation of the results is depicted in (b). The ORFs are indicated by bars; black bars represent conserved
segments, and insertions are depicted by white bars. The arrow indicates the 5’ end of the “body” of RNA 6.

Like ORF-1, TGEV ORF-X3 is preceded by the
presumptive transcription-initiation signal AACTAAAC
(Kapke and Brian, 1986). However, it was unclear
whether in TGEV this ORF is contained in a separate
RNA (Kapke and Brian, 1986; Jacobs et al., 1986;
Rasschaert et al., 1987). Rasschaert et al. (1987) did not
detect such RNA species in Northern blots, but could
have missed it since low percentage agarose gels were
used. By cross-hybridization with an FIPV 3’ end probe,
we detected a TGEV-specific, poly(A)-containing RNA
species of about 700 nucleotides in Northern blots.
Jacobs et al. (1986) previously observed this RNA
species in TGEV-infected cells, but considered it host
specific. However, its synthesis was not affected by
actinomycin D (Jacobs et al., 1986), which is a strong
indication for virus specificity.

ORF-2 is located on a 624-nucleotide segment,
which is absent at the 3’ end of the TGEV genome. A
comparison with the partial TGEV sequence
determined by Rasschaert et al. (1987) showed that the
TGEV genome does not contain an ORF-2 homolog
downstream of the peplomer gene. Moreover,
ORF-2-specific probes did not cross-hybridize with the TGEV
mRNA 1 (not shown), indicating the absence of
sequences related to ORF-2 in the remaining part of the
TGEV genome. ORF-2 is not related to genes of mouse
hepatitis virus (MHV), infectious bronchitis virus (IBV) or
to any of the sequences in the NBRF and Swiss protein
databases. The predicted ORF-2 product is
predominantly hydrophilic but contains a hydrophobic segment
of 12 residues at the N-terminus.

ORF-2 and the 69-nucleotide segment in ORF-1 may
have been deleted in TGEV, but an intriguing possibility
is that FIPV has acquired these sequences by RNA
recombination. Homologous recombination in vitro has
been described for the MHV strains A59 and JHM
(Makino et al., 1986). The remarkable sequence
divergence at the 5’ ends of the peplomer genes of TGEV
and FIPV suggests that similar events may also occur
in vivo (Jacobs et al., 1987). Recently, Luytjes et al.
(1988) discovered a striking sequence similarity
between a pseudogene contained in RNA 2 of MHV A59
and the hemagglutinin (HA) gene of influenza virus,
type C. This finding is best explained by a
nonhomologous recombination event. Conceivably, uptake of
genetic information via nonhomologous recombination
may account for FIPV ORF-2 and the various
nonrelated ORFs in other coronaviruses.

Fig. 4. (a) S1 nuclease mapping of the 5’ end of the FIPV RNA
6. The uniformly labeled, anti-sense strand of the 950-bp PstI-TaqI
fragment was used as a probe. After S1 nuclease digestion the
protected fragments were analyzed in a 2% agarose gel. Lane 1,
hybridization to total poly(A)-containing RNA extracted from FIPV-infected
cells; lane 2, hybridization to yeast tRNA; lane 3, hybridization to
sucrose-gradient purified RNA 6 (fraction 28; de Groot et al., 1987a).
Sau3A-digested pUC8 was used as a molecular weight marker (lane
m). (b) The precise length of the protected fragment was determined
in a sequencing gel. Lane 1, marker; lane 2, protected fragment after
hybridization to purified RNA 6.

Fig. 5. Northern blot analysis of total poly(A)-containing RNA
isolated from FIPV- and TGEV-infected cells. The RNAs were separated
in a 1.5% agarose-formaldehyde gel, blotted to a nylon membrane,
and cross-hybridized to the FIPV 1600-bp PstI fragment of plasmid
B12. The molecular weights of the FIPV RNAs are presented. An
arrow indicates the 700-nucleotide RNA species of TGEV.